Statement of Research
Abstract
I am currently especially interested in the problem of identifying similarities between high-dimensional datasets. Data are often collected by several sources that cannot share their entire datasets, for reasons such as confidentiality agreements or dataset location and size. If distinct datasets share some similar substructure, that similarity can be exploited. For example, two consumer markets (A and B) that differ in geography, economy, political orientation, or some other way may nevertheless have some unusually similar consumer profiles. Sales managers in B could then adopt the sales strategies that proved successful in A for the consumer profiles in which the two markets are unusually similar. This problem has many application domains, including computational biology (finding common structure among proteins given the 3-D coordinates of their amino acids), computer vision (object recognition), and dataset evolution.
To solve this problem, we have taken a two-stage approach [8]. The first stage generates a condensed model (a graph) for each dataset. The second stage identifies similarities between the condensed models of the datasets at a central location.
The first stage requires identifying the components (vertices in the graph) that make up the condensed model and the relationships between components within a model (edge weights). Existing algorithms for finding the model components rely on non-intuitive parameters. We have designed a new algorithm, SCHISM [6], that uses statistically sound and intuitive interestingness thresholds. Our paper highlights the inherent drawback of a constant support threshold, as employed by Apriori [1]-based algorithms. SCHISM unifies the intuition underlying the pioneering subspace mining algorithms CLIQUE [2] and MAFIA [3], and it provides insight into setting thresholds for Apriori-based algorithms. We have extended these ideas to finding interesting sequences [9].
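The drawback of a single fixed support threshold can be illustrated with a small sketch. It is an illustration under stated assumptions, not necessarily the exact threshold used in SCHISM [6]: it assumes each dimension is split into `xi` equal-probability intervals, that dimensions are independent under the null hypothesis, and that a grid cell is called interesting when its observed support exceeds its expected support by a Hoeffding-style margin at significance level `tau`. The names `expected_support`, `hoeffding_threshold`, `xi`, and `tau` are hypothetical.

```python
import math


def expected_support(xi: int, p: int) -> float:
    """Expected fraction of points in one cell of a p-dimensional grid,
    assuming xi equal-probability intervals per dimension and independence."""
    return (1.0 / xi) ** p


def hoeffding_threshold(n: int, xi: int, p: int, tau: float) -> float:
    """Minimum support (as a fraction of n points) for a p-dimensional cell
    to be flagged as interesting.

    Hoeffding's inequality gives P(observed - expected >= t) <= exp(-2 n t^2);
    setting the right-hand side to tau yields t = sqrt(ln(1/tau) / (2 n)).
    """
    margin = math.sqrt(math.log(1.0 / tau) / (2.0 * n))
    return expected_support(xi, p) + margin


if __name__ == "__main__":
    n, xi, tau = 10_000, 10, 0.01  # illustrative values only
    for p in range(1, 6):
        print(p, expected_support(xi, p), hoeffding_threshold(n, xi, p, tau))
```

Under these assumptions the expected support shrinks geometrically with the subspace dimensionality `p`, so any constant threshold is either too lax for low-dimensional cells or too strict for high-dimensional ones, whereas the dimension-dependent threshold decreases with `p` and levels off at the margin term.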
In the second stage, to find inter-model similarities between components (vertices) in different datasets (graphs), we use network-flow-based algorithms, which propagate structural similarities between the two datasets. These algorithms have been shown to perform better than some state-of-the-art algorithms [8]. In the future, I hope to use nonlinear dimensionality reduction techniques to learn underlying manifolds or subspaces. We are also working on extending the techniques used in SCHISM [6] to the mining of other interesting patterns, such as graphs and trees. I am also interested in the problem of mining temporal sequences. I hope to extend my earlier work [4], which dealt with
Similar resources
Analysis the privacy statement of the American Public Libraries and provide privacy statement for public libraries in Iran
Aim: The purpose of this study was to review the privacy statements of the top American public libraries and to provide a privacy statement for users of public libraries in Iran. Method: The research method is a combination of descriptive survey and Delphi library. The research community consisted of 25 American public libraries based on the rankings of the American Library Association's libraries. T...
A Process for Developing the Statement of Internet Research Ethics based on Action Research Method
Background: Research ethics in cyberspace or Internet research ethics (IRE) is a subset of applied ethics that aims to study, introduce, and apply ethical codes for guiding research activities in cyberspace. The compilation of the ethical statement is based on two methods of documentary research and action research. The action research process is implemented in four stages: 1) diagnosis, 2) act...
Introducing the Guideline for Improving the Reporting of Observational Studies in Epidemiology
Background and Objective: Studies in the health sciences comprise observational and interventional studies. A major part of health sciences research is devoted to observational studies. Designing and conducting studies according to scientific guidelines that cover the entire process leads to valid studies whose results can be generalized to the community. Thus, for standardizing ...
Codification Mission Statement and Developmental Strategies of Physical Education and Sport Sciences Faculty of Kharazmi University (2014-2018)
Organizations without a strategy are like ships without a compass. The purpose of this study was to codify a mission statement and develop strategies for the Faculty of Physical Education and Sport Sciences of Kharazmi University in Horizon 1404. The statistical research sample comprised 15 people, including the administrators and active physical education experts who were aware of the situation...
Financial Statement Comparability and the Expected Crash Risk of Stock Prices
The purpose of this study is to explain the relationship between the comparability of financial statements, a qualitative feature of financial reporting, and the expected risk of a stock price crash. The statistical population of this research includes all companies listed on the Tehran Stock Exchange. In order to achieve the research goal, 81 companies were selected for the period between 2010 and ...
Compliance with Statement of Accounting Standards and Performance of Nigerian Banks
Banks play important roles in promoting national development. In order to provide efficient services and to perform their statutory roles effectively, banks are required to comply with established standards. In Nigeria, the Statement of Accounting Standards (SAS), Companies and Allied Matters Act (CAMA) and the Central Bank of Nigeria’s directives and regulations provide guidelines to banks in ...
Publication date: 2005